A Model for Robust Chinese Parser
Abstract
Chinese has many characteristics that differ substantially from those of Western languages, and conventional language processing methods fail on it as a result. For example, Chinese sentences are strings of characters with no spaces marking word boundaries, so word segmentation and unknown word identification techniques are needed to identify words. In addition, Chinese has very few inflectional or grammatical markers, making purely syntactic approaches to parsing almost impossible; a unified approach that draws on both syntactic and semantic information is required. The parsing model proposed here therefore adopts a lexical feature-based grammar formalism called Information-based Case Grammar, which stipulates that the lexical entry for a word contains both semantic and syntactic feature structures. By relaxing the constraints on lexical feature structures, even ill-formed input can be accepted, broadening the coverage of the grammar. A priority-controlled chart parser is proposed which, in conjunction with a mechanism for dynamic grammar extension, addresses the problems of (1) syntactic ambiguity, (2) under-specification and limited coverage of grammars, and (3) ill-formed sentences. The model does this without slowing down the parsing of sentences that require neither constraint relaxation nor dynamic grammar extension.
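The interplay of constraint relaxation and priority control described in the abstract can be illustrated with a small sketch. This is not the paper's parser or its Information-based Case Grammar: the flat feature dictionaries, the one-point-per-clash penalty, and the names `unify` and `best_analysis` are all assumptions made for illustration. The sketch only shows the general idea that strict analyses are tried first, so well-formed input never pays for relaxation.

```python
import heapq


def unify(a, b, relax=False):
    """Merge two flat feature dicts (a hypothetical simplification).

    Returns (merged, penalty), or None when features clash and relax is False.
    With relax=True each clashing feature costs one penalty point and the
    value from `a` is kept.
    """
    merged = dict(a)
    penalty = 0
    for key, value in b.items():
        if key in merged and merged[key] != value:
            if not relax:
                return None
            penalty += 1
        else:
            merged[key] = value
    return merged, penalty


def best_analysis(head, candidates):
    """Return (penalty, merged) for the lowest-penalty candidate, or None.

    Strict unification is attempted first; relaxation is used only when
    every strict attempt fails.
    """
    agenda = []  # min-heap ordered by (penalty, input order)
    for order, cand in enumerate(candidates):
        strict = unify(head, cand)
        if strict is not None:
            heapq.heappush(agenda, (0, order, strict[0]))
    if not agenda:
        # No well-formed analysis exists: fall back to relaxed constraints.
        for order, cand in enumerate(candidates):
            merged, penalty = unify(head, cand, relax=True)
            heapq.heappush(agenda, (penalty, order, merged))
    if not agenda:
        return None
    penalty, _, merged = heapq.heappop(agenda)
    return penalty, merged
```

With a head that expects `{"cat": "NP", "sem": "human"}`, a candidate that matches strictly is returned with penalty 0 and the relaxed pass is never run; if every candidate clashes, the one with the fewest violated features is chosen instead.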
Similar resources
A Block-Based Robust Dependency Parser for Unrestricted Chinese Text
Although substantial efforts have been made to parse Chinese, very few have been practically used due to incapability of handling unrestricted texts. This paper realizes a practical system for Chinese parsing by using a hybrid model of phrase structure partial parsing and dependency parsing. This system showed good performance and high robustness in parsing unrestricted texts and has been appli...
Robust Non-Explicit Neural Discourse Parser in English and Chinese
Neural discourse models proposed so far are very sophisticated and tuned specifically to certain label sets. These are effective, but unwieldy to deploy or repurpose for different label sets or languages. Here, we propose a robust neural classifier for non-explicit discourse relations for both English and Chinese in CoNLL 2016 Shared Task datasets. Our model only requires word vectors and simpl...
Applying Maximum Entropy to Robust Chinese Shallow Parsing
Recently, shallow parsing has been applied to various information processing systems, such as information retrieval, information extraction, question answering, and automatic document summarization. A shallow parser is suitable for online applications, because it is much more efficient and less demanding than a full parser. In this research, we formulate shallow parsing as a sequential tagging ...
Treebank-Based Acquisition of LFG Resources for Chinese
This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of (Burke et al., 2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially exten...
Journal: IJCLCLP
Volume: 1
Publication date: 1996